******************************************************************************************************************************************
*This file builds schooling outcomes from the 1970 Census 1% metro sample, since the schooling variable is missing in the 1% state sample*
******************************************************************************************************************************************

use "$raw_data_lmarket/ipums_census_metro_1970.dta", clear

*Keep only the metro Form 2 sample (fm2), because it includes the schooling variable
keep if sample==197004
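
*Optional guard (a sketch, not part of the original pipeline): stop early with a clear
*error if any variable used further down is absent from this extract.
confirm variable sample school sex race hispan age gqtyped statefip perwt cntygp97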

*Drop institutional group quarters* 
quietly: drop if gqtyped>=100 & gqtyped<=499
*Drop Alaska and Hawaii*
quietly: drop if statefip==2 | statefip==15

*Restrict to ages 19-64
keep if age>=19 & age<=64

*population
gen ipums_pop = 1

*schooling (school==2: currently in school)
foreach var of varlist ipums_* {
	gen `var'_sc = (`var'==1 & school==2)
}

*gender
foreach var of varlist ipums_* {
	gen `var'_m = (`var'==1 & sex==1)
	gen `var'_f = (`var'==1 & sex==2)
}

*race: non-Hispanic white (_w) versus everyone else (_nw)
foreach var of varlist ipums_* {
	gen `var'_w = (`var'==1 & race==1 & hispan==0)
	gen `var'_nw = (`var'==1 & (race!=1 | hispan!=0))
}

*age: young adults, 19-34
foreach var of varlist ipums_* {
	gen `var'_a19_34 = (`var'==1 & age>=19 & age<=34)
}
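
*Optional sanity checks (a sketch, not part of the original pipeline).
*The race split is exhaustive by construction, so the two indicators sum to ipums_pop.
assert ipums_pop_w + ipums_pop_nw == ipums_pop
*The gender split is exhaustive only under the assumption that sex takes the values 1 and 2 alone;
*capture noisily reports a violation without stopping the do-file.
capture noisily assert ipums_pop_m + ipums_pop_f == ipums_pop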

keep cntygp* statefip ipums_* perwt

**Merge czones using geography xwalk**
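*Rename the 1970 county group identifier to match the crosswalk key
gen ctygrp1970 = cntygp97
*Sum person weights to county-group totals
collapse (sum) ipums_* [fw=perwt], by(ctygrp1970) fast
*Number of county groups with a non-missing identifier
count if ctygrp1970!=.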
joinby ctygrp1970 using "$project/xwalks/xwalks_geography/ctygrp1970_czone.dta", unmatched(master)
assert czone!=.
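
*Optional check (a sketch under an assumption about the crosswalk): if afac is an allocation
*share, it should sum to 1 within each county group, so that the weighted collapse that follows
*preserves totals. afac_check is a hypothetical helper name; a violation is reported
*without stopping the do-file.
bysort ctygrp1970: egen double afac_check = total(afac)
capture noisily assert abs(afac_check - 1) < 1e-6
drop afac_check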


**Aggregate at the czone level, weighting each county group by afac from the crosswalk**
collapse (sum) ipums_* [iw=afac], by(czone) fast
foreach var of varlist ipums_* {
	rename `var' `var'_1970
}

save "$clean_data_lmarket/czone1970_school_demographics.dta", replace


